-
Clay tiles and tracer particles were deployed in Mill Creek in Cleveland, OH, to investigate how biofilm and streambed materials respond to high-flow events. Ten cross-sectional transects were established evenly across a 100-meter reach, where cinderblocks holding 16 unglazed clay tiles were buried in the streambed near the deepest part of the channel to promote biofilm growth. Particles of sizes corresponding to the 50th, 75th, and 90th percentiles of the substrate size classes at each transect were painted and numbered for use as tracer particles. Samples from the tiles were collected after each high-flow event, and biomass was measured as chlorophyll a (chl a) and ash-free dry mass (AFDM). Movement of tracer particles (yes/no) was recorded to estimate how much of the streambed moved.
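As a rough illustration of the bed-mobility estimate described above, the sketch below computes the fraction of tracers that moved in each size-percentile class after an event; the data and column names are hypothetical, not taken from the dataset.

```python
# Hypothetical sketch: estimate bed mobility from yes/no tracer records.
# Data values and column names are illustrative only.
import pandas as pd

tracers = pd.DataFrame({
    "size_class": ["D50", "D50", "D50", "D75", "D75", "D90"],
    "moved":      [True,  True,  False, True,  False, False],
})

# Fraction of painted tracers that moved, per size-percentile class:
# a simple proxy for how much of the streambed was mobilized.
mobility = tracers.groupby("size_class")["moved"].mean()
print(mobility)
```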
-
Water quality sensors were placed in 3 urban streams in Cleveland, OH, and 4 urban streams in Denver, CO, to estimate stream metabolism and assess responses to high-flow events. MiniDOT (dissolved oxygen and temperature) and Onset (specific conductance) sensors were placed mid-channel near USGS gages. Light was estimated as global horizontal irradiance (GHI) supplied by SolCast. Data collection was part of the NSF STORMS project (PI Jefferson; co-PIs Costello, Bhaskar, Turner). Specific conductance, dissolved oxygen, and light were measured every 10 minutes. Sensors were removed during winter months to avoid damage. Datasets were cleaned to remove values recorded while sensors were out of the water, buried, or removed for maintenance and calibration.
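A minimal sketch of the kind of cleaning step described above, dropping readings logged while a sensor was out of the water or pulled for maintenance, might look like the following; the column names, values, and flagged interval are assumptions, not the project's actual QA/QC procedure.

```python
# Hypothetical sketch: drop dissolved-oxygen readings recorded during
# out-of-water or maintenance intervals. All values are illustrative.
import pandas as pd

df = pd.DataFrame({
    "timestamp": pd.date_range("2019-06-01 08:00", periods=6, freq="10min"),
    "do_mg_l": [8.2, 8.1, 14.9, 15.2, 8.0, 7.9],  # spike while out of water
})

# Known out-of-water / maintenance intervals, e.g. from field notes.
bad_intervals = [("2019-06-01 08:20", "2019-06-01 08:30")]

mask = pd.Series(False, index=df.index)
for start, end in bad_intervals:
    mask |= df["timestamp"].between(start, end)

clean = df.loc[~mask]
print(clean)
```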
-
This dataset contains dissolved organic carbon concentrations from surface water samples collected at 100 urban stream locations in the greater Salt Lake City, Utah metropolitan area. Samples were collected four times (July 2022, October 2022, February 2023, and May 2023) to capture spatial and seasonal variation in DOC concentrations. Filtered stream samples were analyzed for dissolved organic carbon concentration. These data were collected as part of the Carbon in Urban Rivers Biogeochemistry (CURB) Project. Detailed field data and site data are published separately and can be linked using the “curbid” and “synoptic_event” columns in each dataset.
-
This dataset contains field measurements taken during water sampling at 100 urban stream locations in the greater Salt Lake City, Utah (USA) metropolitan area. Field collection took place during four synoptic sampling events (July 2022, October 2022, February 2023, and May 2023) to capture spatial and seasonal variation in stream conditions (specific conductance, water temperature, dissolved oxygen, pH, and ORP). Filtered stream samples were analyzed for dissolved organic carbon concentration and characteristics, available in a separate dataset. These data were collected as part of the Carbon in Urban Rivers Biogeochemistry (CURB) Project. Detailed field data and site data are published separately and can be linked using the “curbid” and “synoptic_event” columns in each dataset.
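Because both CURB abstracts note that the tables can be linked on the “curbid” and “synoptic_event” columns, a minimal pandas join might look like the sketch below; the tiny stand-in tables and their values are hypothetical.

```python
# Sketch: link CURB field measurements to DOC concentrations on the
# shared "curbid" and "synoptic_event" keys named in the dataset
# descriptions. The stand-in data here is illustrative only.
import pandas as pd

field = pd.DataFrame({
    "curbid": ["SLC001", "SLC002"],
    "synoptic_event": ["2022-07", "2022-07"],
    "water_temp_c": [18.4, 16.9],
})
doc = pd.DataFrame({
    "curbid": ["SLC001", "SLC002"],
    "synoptic_event": ["2022-07", "2022-07"],
    "doc_mg_l": [3.1, 5.6],
})

merged = field.merge(doc, on=["curbid", "synoptic_event"])
print(merged)
```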
-
Large language models (LLMs) have reshaped the landscape of program synthesis. However, contemporary LLM-based code completion systems often hallucinate broken code because they lack appropriate code context, particularly when working with definitions that are neither in the training data nor near the cursor. This paper demonstrates that tighter integration with the type and binding structure of the programming language in use, as exposed by its language server, can help address this contextualization problem in a token-efficient manner. In short, we contend that AIs need IDEs, too! In particular, we integrate LLM code generation into the Hazel live program sketching environment. The Hazel Language Server is able to identify the type and typing context of the hole that the programmer is filling, with Hazel's total syntax and type error correction ensuring that a meaningful program sketch is available whenever the developer requests a completion. This allows the system to prompt the LLM with codebase-wide contextual information that is not lexically local to the cursor, nor necessarily in the same file, but that is likely to be semantically local to the developer's goal. Completions synthesized by the LLM are then iteratively refined via further dialog with the language server, which provides error localization and error messages. To evaluate these techniques, we introduce MVUBench, a dataset of model-view-update (MVU) web applications with accompanying unit tests that have been written from scratch to avoid data contamination, and that can easily be ported to new languages because they do not have large external library dependencies. These applications serve as challenge problems due to their extensive reliance on application-specific data structures. Through an ablation study, we examine the impact of contextualization with type definitions, function headers, and error messages, individually and in combination. We find that contextualization with type definitions is particularly impactful. After introducing our ideas in the context of Hazel, a low-resource language, we duplicate our techniques and port MVUBench to TypeScript in order to validate the applicability of these methods to higher-resource mainstream languages. Finally, we outline ChatLSP, a conservative extension to the Language Server Protocol (LSP) that language servers can implement to expose capabilities that AI code completion systems of various designs can use to incorporate static context when generating prompts for an LLM.
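The loop described above (query the language server for the hole's expected type and semantically local definitions, prompt the LLM, then refine against reported errors) might be sketched as follows; every function here is a hypothetical stub, not the actual Hazel Language Server or ChatLSP API.

```python
# Hypothetical sketch of the contextualize-then-repair loop described in
# the abstract. All helper functions are illustrative stubs, not the
# Hazel Language Server or ChatLSP API.

def expected_type_at_hole(sketch: str) -> str:
    return "Model -> Msg -> Model"           # stub: from the language server

def relevant_type_definitions(sketch: str) -> list[str]:
    return ["type Model = { count : Int }"]  # stub: semantically local defs

def llm_complete(prompt: str) -> str:
    return "fun model msg -> model"          # stub: LLM call

def errors_for(candidate: str) -> list[str]:
    return []                                # stub: language server errors

def fill_hole(sketch: str, max_rounds: int = 3) -> str:
    # Prompt with codebase-wide context that is semantically, not
    # lexically, local to the hole being filled.
    prompt = "\n".join([
        f"Expected type at hole: {expected_type_at_hole(sketch)}",
        *relevant_type_definitions(sketch),
        sketch,
    ])
    candidate = llm_complete(prompt)
    # Iteratively refine using error localization from the language server.
    for _ in range(max_rounds):
        errs = errors_for(candidate)
        if not errs:
            break
        candidate = llm_complete(prompt + "\nFix these errors:\n" + "\n".join(errs))
    return candidate

print(fill_hole("update : Model -> Msg -> Model\nupdate = ?"))
```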
-
This dataset contains turbidity data and storm event characteristics for three urban watersheds in Cuyahoga County, Ohio. Turbidity data were collected at a frequency of 10 minutes using in-situ Cyclops-7 turbidimeters designed by Turner Designs and integrated with a Cyclops-7 logger by Precision Measurement Engineering, Inc. Data were collected for three years, from September 2018 to September 2021. Turbidity data were harmonized with instantaneous discharge data from USGS stream gages. Event characteristics include runoff, precipitation, and antecedent conditions. The data support the findings of the study titled "Urbanization and Suspended Sediment Transport Dynamics: A Comparative Study of Watersheds with Varying Degree of Urbanization using Concentration-Discharge Hysteresis".
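As a hedged illustration of the concentration-discharge hysteresis analysis the study title refers to, the sketch below computes one simple event hysteresis index by comparing rising- and falling-limb turbidity at matched normalized discharge; the index actually used in the study may differ.

```python
# Hypothetical sketch: a simple concentration-discharge hysteresis index
# comparing rising- vs falling-limb turbidity at matched discharge levels.
# Event data and the index definition are illustrative only.
import numpy as np

# One storm event: instantaneous discharge (cfs) and turbidity (NTU).
q = np.array([5.0, 20.0, 60.0, 100.0, 70.0, 30.0, 8.0])
c = np.array([10.0, 80.0, 150.0, 120.0, 60.0, 30.0, 12.0])

peak = int(np.argmax(q))
q_norm = (q - q.min()) / (q.max() - q.min())
c_norm = (c - c.min()) / (c.max() - c.min())

# Interpolate each limb onto common normalized discharge levels.
levels = np.linspace(0.1, 0.9, 9)
c_rise = np.interp(levels, q_norm[: peak + 1], c_norm[: peak + 1])
c_fall = np.interp(levels, q_norm[peak:][::-1], c_norm[peak:][::-1])

# Positive values suggest clockwise hysteresis: turbidity peaks on the
# rising limb, before the discharge peak.
hi = float(np.mean(c_rise - c_fall))
print(f"hysteresis index: {hi:.2f}")
```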
-
This dataset contains tabular data at three scales (city, tract, and synoptic site) and related vector shapefiles (watersheds or buffers around synoptic sites) for areas included in the Carbon in Urban Rivers Biogeochemistry (CURB) Project, which assesses how social, built, and biophysical factors shape aquatic functions. The city scale includes 486 urban areas in the continental United States with more than 50,000 residents. Tabular data are provided for each urban area (CURB_CensusUrbanArea.csv) and for all U.S. Census tracts within seven urban areas (Atlanta, GA; Boston, MA; Miami, FL; Phoenix, AZ; Portland, OR; Salt Lake City, UT; and San Francisco, CA; CURB_CensusTract.csv) to characterize a range of social, built, and biophysical factors. In six focal cities (Baltimore, MD; Boston, MA; Atlanta, GA; Miami, FL; Salt Lake City, UT; and Portland, OR), up to 100 sites were selected for synoptic water quality sampling. For each synoptic site, tabular data (CURB_SynopticSite.csv) are provided to characterize social, built, and biophysical factors within the watershed (Atlanta, Baltimore, Boston, Portland, Salt Lake City) or within a buffer around the site (Miami). Vector shapefiles are provided for the watershed boundaries (CURB_Synoptic_Watersheds.zip) of all synoptic sites in each city except Miami, FL, where 400-m buffers around each synoptic site were used (CURB_Miami_Synoptic_Buffers.zip).
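To relate the synoptic-site table to its watershed polygons, one could join the shapefile to the site CSV on a shared site identifier, as in the sketch below; the join key name ("curbid") is an assumption borrowed from the other CURB datasets, so check each file's actual columns before using it.

```python
# Hypothetical sketch: attach synoptic-site attributes to watershed
# polygons. The join key ("curbid") is assumed, not confirmed; inspect
# the actual columns in each file first.
import pandas as pd
import geopandas as gpd

sites = pd.read_csv("CURB_SynopticSite.csv")
sheds = gpd.read_file("CURB_Synoptic_Watersheds.zip")  # zipped shapefile

joined = sheds.merge(sites, on="curbid", how="left")
print(joined.head())
```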
-
Type systems typically only define the conditions under which an expression is well-typed, leaving ill-typed expressions formally meaningless. This approach is insufficient as the basis for language servers driving modern programming environments, which are expected to localize and recover from multiple simultaneous errors and continue to provide a variety of downstream semantic services. This paper addresses this problem, contributing the first comprehensive formal account of total type error localization and recovery: the marked lambda calculus. In particular, we define a gradual type system for expressions with marked errors, which operate as non-empty holes, together with a total procedure for marking arbitrary unmarked expressions. We mechanize the metatheory of the marked lambda calculus in Agda and implement it, scaled up, as the new basis for Hazel, a full-scale live functional programming environment with, uniquely, no meaningless editor states. The marked lambda calculus is bidirectionally typed, so localization decisions are systematically predictable based on a local flow of typing information. Constraint-based type inference can bring more distant information to bear in discovering inconsistencies, but this notoriously complicates error localization. We approach this problem by deploying constraint solving as a type-hole-filling layer atop this gradual bidirectionally typed core. Errors arising from inconsistent unification constraints are localized exclusively to type and expression holes, i.e., the system identifies unfillable holes using a system of traced provenances, rather than being localized in an ad hoc manner to particular expressions. The user can then interactively shift these errors to particular downstream expressions by selecting from suggested partially consistent type hole fillings, which returns control back to the bidirectional system. We implement this type hole inference system in Hazel.
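A toy rendering of the marking idea, where analysis never fails and inconsistent subterms are instead wrapped in error marks that behave as non-empty holes, might look like the following sketch; it compresses the calculus drastically and stands in no relation to the Agda mechanization.

```python
# Toy sketch of total error marking: checking never fails; inconsistent
# subterms are wrapped in marks that act as non-empty holes. This is a
# drastic simplification of the marked lambda calculus, for flavor only.

UNKNOWN = "?"  # the gradual unknown type

def consistent(t1: str, t2: str) -> bool:
    # Gradual consistency: the unknown type is consistent with anything.
    return UNKNOWN in (t1, t2) or t1 == t2

def mark(expr, expected: str):
    """Return (possibly marked expr, type). A literal is checked against
    the expected type; inconsistency yields a mark, never a failure."""
    actual = "Int" if isinstance(expr, int) else "Bool"
    if consistent(actual, expected):
        return expr, actual
    return ("MARK(inconsistent)", expr), UNKNOWN  # non-empty hole

print(mark(3, "Int"))    # (3, 'Int')
print(mark(3, "Bool"))   # (('MARK(inconsistent)', 3), '?')
```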
